Overview

Dataset statistics

Number of variables: 32
Number of observations: 50000
Missing cells: 147570
Missing cells (%): 9.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 12.2 MiB
Average record size in memory: 256.0 B
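
The derived figures in this table follow from the raw counts. The sketch below recomputes them in Python. It assumes that the missing-cell percentage is taken over all cells (variables × observations) and that "MiB" means 2**20 bytes; neither assumption is stated in the report itself.

```python
# Recompute the derived overview figures from the raw counts in the report.
N_VARIABLES = 32
N_OBSERVATIONS = 50_000
MISSING_CELLS = 147_570
TOTAL_SIZE_MIB = 12.2  # assumption: 1 MiB = 2**20 bytes


def missing_cell_share(missing: int, n_vars: int, n_obs: int) -> float:
    """Fraction of all cells (variables x observations) that are missing."""
    return missing / (n_vars * n_obs)


def average_record_bytes(total_mib: float, n_obs: int) -> float:
    """Average in-memory size of one row, in bytes."""
    return total_mib * 2**20 / n_obs


if __name__ == "__main__":
    share = missing_cell_share(MISSING_CELLS, N_VARIABLES, N_OBSERVATIONS)
    print(f"Missing cells: {share:.1%}")
    print(f"Average record: {average_record_bytes(TOTAL_SIZE_MIB, N_OBSERVATIONS):.1f} B")
```

The record size computed this way lands slightly below 256 B because the 12.2 MiB total is itself rounded.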

Variable types

Numeric: 12
Categorical: 13
Boolean: 4
Text: 3

Alerts

ppm has constant value "" (Constant)
amort has constant value "" (Constant)
super_conforming has constant value "" (Constant)
program has constant value "" (Constant)
relief_refi has constant value "" (Constant)
prop_val has constant value "" (Constant)
interest_only has constant value "" (Constant)
MI_cancel has constant value "" (Constant)
first_pay is highly overall correlated with mat_date (High correlation)
mat_date is highly overall correlated with first_pay and 1 other field (High correlation)
orig_cltv is highly overall correlated with orig_ltv (High correlation)
orig_ltv is highly overall correlated with orig_cltv (High correlation)
orig_loan_term is highly overall correlated with mat_date (High correlation)
channel is highly overall correlated with seller (High correlation)
seller is highly overall correlated with channel and 1 other field (High correlation)
servicer is highly overall correlated with seller (High correlation)
fha is highly imbalanced (78.1%) (Imbalance)
unit_num is highly imbalanced (94.8%) (Imbalance)
occupancy is highly imbalanced (70.5%) (Imbalance)
prop_type is highly imbalanced (54.6%) (Imbalance)
msa has 9256 (18.5%) missing values (Missing)
super_conforming has 48944 (97.9%) missing values (Missing)
prr_loan_seq_num has 44685 (89.4%) missing values (Missing)
relief_refi has 44685 (89.4%) missing values (Missing)
credit_score is highly skewed (γ1 = 75.22765869) (Skewed)
loan_id has unique values (Unique)
mortgage_ins_pct has 46511 (93.0%) zeros (Zeros)

Reproduction

Analysis started: 2023-11-13 19:39:12.410741
Analysis finished: 2023-11-13 19:39:54.530149
Duration: 42.12 seconds
Software version: ydata-profiling v4.6.1
Download configuration: config.json
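
The reported duration can be checked against the two timestamps. A minimal Python sketch, assuming both timestamps are in the same time zone and use the format shown above:

```python
from datetime import datetime

# Timestamp layout used in the Reproduction block.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def duration_seconds(started: str, finished: str) -> float:
    """Elapsed seconds between two report timestamps."""
    start = datetime.strptime(started, TIMESTAMP_FORMAT)
    end = datetime.strptime(finished, TIMESTAMP_FORMAT)
    return (end - start).total_seconds()


if __name__ == "__main__":
    elapsed = duration_seconds(
        "2023-11-13 19:39:12.410741",
        "2023-11-13 19:39:54.530149",
    )
    print(f"{elapsed:.2f} seconds")
```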

Variables

credit_score
Real number (ℝ)

SKEWED 

Distinct316
Distinct (%)0.6%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean761.20512
Minimum431
Maximum9999
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size390.8 KiB

Quantile statistics

Minimum431
5-th percentile680
Q1736
median771
Q3792
95-th percentile809
Maximum9999
Range9568
Interquartile range (IQR)56
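
The quartiles above are enough to flag extreme values with a simple rule of thumb. The sketch below applies Tukey's 1.5 × IQR fences in Python. This rule is not something the report applies; it is offered here only as one way to read the quantile table, and the multiplier 1.5 is a conventional choice.

```python
from typing import Tuple


def tukey_fences(q1: float, q3: float, k: float = 1.5) -> Tuple[float, float]:
    """Return (lower, upper) fences at k * IQR beyond the quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def is_outlier(value: float, q1: float, q3: float, k: float = 1.5) -> bool:
    """True if value falls outside the Tukey fences."""
    lower, upper = tukey_fences(q1, q3, k)
    return value < lower or value > upper


if __name__ == "__main__":
    # Q1 and Q3 for credit_score, taken from the quantile statistics.
    print(tukey_fences(736, 792))
    for score in (431, 771, 9999):
        print(score, is_outlier(score, 736, 792))
```

With Q1 = 736 and Q3 = 792 both the minimum (431) and the maximum (9999) fall outside the fences, while the median (771) does not.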

Descriptive statistics

Standard deviation101.53447
Coefficient of variation (CV)0.13338648
Kurtosis6850.1351
Mean761.20512
Median Absolute Deviation (MAD)26
Skewness75.227659
Sum38060256
Variance10309.249
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
801 723
 
1.4%
802 716
 
1.4%
797 692
 
1.4%
790 692
 
1.4%
791 669
 
1.3%
798 669
 
1.3%
809 661
 
1.3%
793 660
 
1.3%
786 658
 
1.3%
787 651
 
1.3%
Other values (306) 43209
86.4%
ValueCountFrequency (%)
431 1
< 0.1%
443 1
< 0.1%
470 1
< 0.1%
472 1
< 0.1%
480 1
< 0.1%
486 1
< 0.1%
491 1
< 0.1%
492 1
< 0.1%
494 1
< 0.1%
497 1
< 0.1%
ValueCountFrequency (%)
9999 5
< 0.1%
850 1
 
< 0.1%
835 1
 
< 0.1%
831 1
 
< 0.1%
829 3
< 0.1%
828 1
 
< 0.1%
827 3
< 0.1%
826 4
< 0.1%
825 5
< 0.1%
824 5
< 0.1%
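
The five records with credit_score 9999 sit far above the next largest value (850), which suggests a placeholder code rather than a real score. That reading is an assumption, not something the report states. Under it, the mean of the remaining records can be recovered from the reported sum and counts alone, as in this Python sketch:

```python
def mean_without_sentinel(total: float, n: int, sentinel: float, sentinel_count: int) -> float:
    """Mean of a column after dropping `sentinel_count` rows equal to `sentinel`.

    Uses only aggregate figures (sum and counts), so no row-level data is needed.
    """
    remaining = n - sentinel_count
    if remaining <= 0:
        raise ValueError("no rows left after removing sentinel values")
    return (total - sentinel * sentinel_count) / remaining


if __name__ == "__main__":
    # Sum, row count, and the count of 9999 values as reported for credit_score.
    adjusted = mean_without_sentinel(total=38_060_256, n=50_000, sentinel=9999, sentinel_count=5)
    print(f"{adjusted:.2f}")
```

The adjusted mean moves by less than one point, so the sentinel mainly distorts the higher moments (skewness, kurtosis) rather than the mean.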

first_pay
Real number (ℝ)

HIGH CORRELATION 

Distinct26
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean200923.57
Minimum200902
Maximum201310
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size390.8 KiB

Quantile statistics

Minimum200902
5-th percentile200903
Q1200906
median200908
Q3200912
95-th percentile201002
Maximum201310
Range408
Interquartile range (IQR)6

Descriptive statistics

Standard deviation35.57835
Coefficient of variation (CV)0.00017707405
Kurtosis1.3647377
Mean200923.57
Median Absolute Deviation (MAD)3
Skewness1.7557016
Sum1.0046178 × 10^10
Variance1265.819
MonotonicityNot monotonic
Histogram with fixed size bins (bins=26)
ValueCountFrequency (%)
200909 5282
10.6%
200905 4811
9.6%
200904 4471
8.9%
200908 4249
8.5%
200907 4247
8.5%
201002 4181
8.4%
201001 4137
8.3%
200906 4054
8.1%
200912 4015
8.0%
200910 3773
7.5%
Other values (16) 6780
13.6%
ValueCountFrequency (%)
200902 47
 
0.1%
200903 3159
6.3%
200904 4471
8.9%
200905 4811
9.6%
200906 4054
8.1%
200907 4247
8.5%
200908 4249
8.5%
200909 5282
10.6%
200910 3773
7.5%
200911 3347
6.7%
ValueCountFrequency (%)
201310 1
 
< 0.1%
201209 1
 
< 0.1%
201107 1
 
< 0.1%
201012 1
 
< 0.1%
201011 1
 
< 0.1%
201010 4
< 0.1%
201009 6
< 0.1%
201008 9
< 0.1%
201007 5
< 0.1%
201006 8
< 0.1%

fha
Categorical

IMBALANCE 

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size390.8 KiB
N
46758 
Y
 
3238
9
 
4

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters50000
Distinct characters3
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowN
2nd rowY
3rd rowN
4th rowN
5th rowN

Common Values

ValueCountFrequency (%)
N 46758
93.5%
Y 3238
 
6.5%
9 4
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
n 46758
93.5%
y 3238
 
6.5%
9 4
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
N 46758
93.5%
Y 3238
 
6.5%
9 4
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 49996
> 99.9%
Decimal Number 4
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 46758
93.5%
Y 3238
 
6.5%
Decimal Number
ValueCountFrequency (%)
9 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 49996
> 99.9%
Common 4
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 46758
93.5%
Y 3238
 
6.5%
Common
ValueCountFrequency (%)
9 4
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 46758
93.5%
Y 3238
 
6.5%
9 4
 
< 0.1%
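
The 78.1% imbalance figure for fha is consistent with one minus the normalized Shannon entropy of the category counts. This formula is inferred from the numbers in this report rather than quoted from the ydata-profiling documentation, so treat the Python sketch below as an illustration, not as the library's implementation.

```python
import math
from typing import Sequence


def imbalance_score(counts: Sequence[int]) -> float:
    """1 - H / log2(k), where H is the Shannon entropy over k categories.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    """
    k = len(counts)
    if k < 2:
        raise ValueError("imbalance is undefined in this sketch for fewer than two categories")
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


if __name__ == "__main__":
    # Category counts for fha (N, Y, 9) as listed under Common Values.
    print(f"{imbalance_score([46758, 3238, 4]):.1%}")
```

The same function can be applied to the counts of unit_num, occupancy, or prop_type to compare against their alerts.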

mat_date
Real number (ℝ)

HIGH CORRELATION 

Distinct207
Distinct (%)0.4%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean203587.62
Minimum201411
Maximum204011
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size390.8 KiB

Quantile statistics

Minimum201411
5-th percentile202405
Q1203902
median203906
Q3203909
95-th percentile204001
Maximum204011
Range2600
Interquartile range (IQR)7

Descriptive statistics

Standard deviation609.80327
Coefficient of variation (CV)0.0029952866
Kurtosis0.14700177
Mean203587.62
Median Absolute Deviation (MAD)4
Skewness-1.4070794
Sum1.0179381 × 10^10
Variance371860.03
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
203908 4056
 
8.1%
203904 3797
 
7.6%
203903 3490
 
7.0%
203906 3225
 
6.5%
203907 3221
 
6.4%
203905 3135
 
6.3%
203912 3023
 
6.0%
203911 3017
 
6.0%
204001 2993
 
6.0%
203909 2908
 
5.8%
Other values (197) 17135
34.3%
ValueCountFrequency (%)
201411 1
 
< 0.1%
201710 1
 
< 0.1%
201712 1
 
< 0.1%
201902 19
 
< 0.1%
201903 37
0.1%
201904 43
0.1%
201905 35
0.1%
201906 39
0.1%
201907 30
0.1%
201908 53
0.1%
ValueCountFrequency (%)
204011 1
 
< 0.1%
204010 1
 
< 0.1%
204009 3
 
< 0.1%
204008 2
 
< 0.1%
204007 8
 
< 0.1%
204006 5
 
< 0.1%
204005 7
 
< 0.1%
204004 3
 
< 0.1%
204003 8
 
< 0.1%
204002 109
0.2%
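
The values of first_pay and mat_date look like YYYYMM dates stored as numbers, so statistics such as range, mean, and standard deviation are computed on the raw integers rather than on calendar time. The Python sketch below decodes such values and measures distances in months. The YYYYMM reading and the convention that a loan term counts both the first and the last payment month are assumptions.

```python
from typing import Tuple


def split_yyyymm(value: int) -> Tuple[int, int]:
    """Split an integer like 200909 into (year, month)."""
    year, month = divmod(value, 100)
    if not 1 <= month <= 12:
        raise ValueError(f"{value} is not a valid YYYYMM value")
    return year, month


def months_between(start: int, end: int) -> int:
    """Calendar months from one YYYYMM value to another."""
    start_year, start_month = split_yyyymm(start)
    end_year, end_month = split_yyyymm(end)
    return (end_year - start_year) * 12 + (end_month - start_month)


def payment_count(first_pay: int, mat_date: int) -> int:
    """Number of monthly payments, counting both end points (assumed convention)."""
    return months_between(first_pay, mat_date) + 1


if __name__ == "__main__":
    # The numeric range of first_pay is 408, but the calendar span is much shorter.
    print(months_between(200902, 201310))
    # A first payment in 2009-09 and maturity in 2039-08 span 360 payments.
    print(payment_count(200909, 203908))
```

Converting these columns to a date type before profiling would make the quantile and dispersion statistics interpretable.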

msa
Real number (ℝ)

MISSING 

Distinct430
Distinct (%)1.1%
Missing9256
Missing (%)18.5%
Infinite0
Infinite (%)0.0%
Mean30225.759
Minimum10180
Maximum49740
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size390.8 KiB

Quantile statistics

Minimum10180
5-th percentile12420
Q119124
median31700
Q340060
95-th percentile47644
Maximum49740
Range39560
Interquartile range (IQR)20936

Descriptive statistics

Standard deviation11333.055
Coefficient of variation (CV)0.37494692
Kurtosis-1.2806112
Mean30225.759
Median Absolute Deviation (MAD)9920
Skewness-0.16199969
Sum1.2315183 × 10^9
Variance1.2843814 × 10^8
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
16974 1585
 
3.2%
31084 1090
 
2.2%
33460 869
 
1.7%
47894 816
 
1.6%
12060 814
 
1.6%
41180 717
 
1.4%
42644 663
 
1.3%
19740 646
 
1.3%
38060 627
 
1.3%
38900 624
 
1.2%
Other values (420) 32293
64.6%
(Missing) 9256
 
18.5%
ValueCountFrequency (%)
10180 8
 
< 0.1%
10420 98
0.2%
10500 11
 
< 0.1%
10540 3
 
< 0.1%
10580 122
0.2%
10740 163
0.3%
10780 3
 
< 0.1%
10900 155
0.3%
11020 15
 
< 0.1%
11100 8
 
< 0.1%
ValueCountFrequency (%)
49740 3
 
< 0.1%
49700 15
 
< 0.1%
49660 34
 
0.1%
49620 88
0.2%
49420 23
 
< 0.1%
49340 173
0.3%
49180 107
0.2%
49020 22
 
< 0.1%
48900 108
0.2%
48864 121
0.2%

mortgage_ins_pct
Real number (ℝ)

ZEROS 

Distinct15
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean1.55514
Minimum0
Maximum35
Zeros46511
Zeros (%)93.0%
Negative0
Negative (%)0.0%
Memory size390.8 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile17
Maximum35
Range35
Interquartile range (IQR)0

Descriptive statistics

Standard deviation5.96548
Coefficient of variation (CV)3.8359762
Kurtosis13.438167
Mean1.55514
Median Absolute Deviation (MAD)0
Skewness3.8325425
Sum77757
Variance35.586951
MonotonicityNot monotonic
Histogram with fixed size bins (bins=15)
ValueCountFrequency (%)
0 46511
93.0%
25 1487
 
3.0%
30 732
 
1.5%
12 662
 
1.3%
17 412
 
0.8%
6 79
 
0.2%
35 59
 
0.1%
20 42
 
0.1%
18 9
 
< 0.1%
21 2
 
< 0.1%
Other values (5) 5
 
< 0.1%
ValueCountFrequency (%)
0 46511
93.0%
6 79
 
0.2%
9 1
 
< 0.1%
12 662
 
1.3%
15 1
 
< 0.1%
16 1
 
< 0.1%
17 412
 
0.8%
18 9
 
< 0.1%
19 1
 
< 0.1%
20 42
 
0.1%
ValueCountFrequency (%)
35 59
 
0.1%
32 1
 
< 0.1%
30 732
1.5%
25 1487
3.0%
21 2
 
< 0.1%
20 42
 
0.1%
19 1
 
< 0.1%
18 9
 
< 0.1%
17 412
 
0.8%
16 1
 
< 0.1%

unit_num
Categorical

IMBALANCE 

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size390.8 KiB
1
49421 
2
 
400
4
 
105
3
 
74

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters50000
Distinct characters4
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1
2nd row1
3rd row1
4th row1
5th row1

Common Values

ValueCountFrequency (%)
1 49421
98.8%
2 400
 
0.8%
4 105
 
0.2%
3 74
 
0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1 49421
98.8%
2 400
 
0.8%
4 105
 
0.2%
3 74
 
0.1%

Most occurring characters

ValueCountFrequency (%)
1 49421
98.8%
2 400
 
0.8%
4 105
 
0.2%
3 74
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 50000
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 49421
98.8%
2 400
 
0.8%
4 105
 
0.2%
3 74
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 50000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 49421
98.8%
2 400
 
0.8%
4 105
 
0.2%
3 74
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 49421
98.8%
2 400
 
0.8%
4 105
 
0.2%
3 74
 
0.1%

occupancy
Categorical

IMBALANCE 

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size390.8 KiB
P
46146 
S
 
2216
I
 
1638

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters50000
Distinct characters3
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowP
2nd rowP
3rd rowP
4th rowP
5th rowP

Common Values

ValueCountFrequency (%)
P 46146
92.3%
S 2216
 
4.4%
I 1638
 
3.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
p 46146
92.3%
s 2216
 
4.4%
i 1638
 
3.3%

Most occurring characters

ValueCountFrequency (%)
P 46146
92.3%
S 2216
 
4.4%
I 1638
 
3.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 50000
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
P 46146
92.3%
S 2216
 
4.4%
I 1638
 
3.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 50000
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
P 46146
92.3%
S 2216
 
4.4%
I 1638
 
3.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
P 46146
92.3%
S 2216
 
4.4%
I 1638
 
3.3%

orig_cltv
Real number (ℝ)

HIGH CORRELATION 

Distinct149
Distinct (%)0.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean68.36696
Minimum5
Maximum212
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size390.8 KiB

Quantile statistics

Minimum5
5-th percentile33
Q157
median73
Q380
95-th percentile95
Maximum212
Range207
Interquartile range (IQR)23

Descriptive statistics

Standard deviation18.340819
Coefficient of variation (CV)0.26827021
Kurtosis0.62136391
Mean68.36696
Median Absolute Deviation (MAD)9
Skewness-0.41936005
Sum3418348
Variance336.38563
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
80 8719
 
17.4%
75 2420
 
4.8%
90 1701
 
3.4%
70 1300
 
2.6%
60 1201
 
2.4%
79 1170
 
2.3%
74 1063
 
2.1%
95 1029
 
2.1%
78 1016
 
2.0%
73 975
 
1.9%
Other values (139) 29406
58.8%
ValueCountFrequency (%)
5 1
 
< 0.1%
7 4
 
< 0.1%
8 10
 
< 0.1%
9 3
 
< 0.1%
10 10
 
< 0.1%
11 7
 
< 0.1%
12 19
< 0.1%
13 19
< 0.1%
14 30
0.1%
15 42
0.1%
ValueCountFrequency (%)
212 1
< 0.1%
206 1
< 0.1%
193 1
< 0.1%
183 2
< 0.1%
181 1
< 0.1%
176 1
< 0.1%
173 1
< 0.1%
172 1
< 0.1%
169 1
< 0.1%
151 1
< 0.1%

orig_dti
Real number (ℝ)

Distinct66
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean136.65418
Minimum1
Maximum999
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size390.8 KiB

Quantile statistics

Minimum1
5-th percentile14
Q124
median33
Q344
95-th percentile999
Maximum999
Range998
Interquartile range (IQR)20

Descriptive statistics

Standard deviation300.98115
Coefficient of variation (CV)2.2025023
Kurtosis4.3236369
Mean136.65418
Median Absolute Deviation (MAD)10
Skewness2.5119297
Sum6832709
Variance90589.651
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
999 5423
 
10.8%
26 1438
 
2.9%
24 1352
 
2.7%
28 1348
 
2.7%
27 1344
 
2.7%
31 1343
 
2.7%
25 1332
 
2.7%
29 1326
 
2.7%
30 1324
 
2.6%
33 1313
 
2.6%
Other values (56) 32457
64.9%
ValueCountFrequency (%)
1 5
 
< 0.1%
2 10
 
< 0.1%
3 18
 
< 0.1%
4 27
 
0.1%
5 52
 
0.1%
6 60
 
0.1%
7 100
 
0.2%
8 144
0.3%
9 229
0.5%
10 250
0.5%
ValueCountFrequency (%)
999 5423
10.8%
65 21
 
< 0.1%
64 38
 
0.1%
63 35
 
0.1%
62 46
 
0.1%
61 39
 
0.1%
60 40
 
0.1%
59 63
 
0.1%
58 57
 
0.1%
57 72
 
0.1%
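
The value 999 accounts for 10.8% of orig_dti and is far from the next largest value (65), so it probably encodes "not available". If so, the reported mean of 136.65 is dominated by the code; the reported sum and counts imply a mean of about 31.7 for the remaining rows. The Python sketch below shows one way to mask such codes before profiling. The sentinel table is hypothetical and inferred only from the extreme values in this report.

```python
from typing import Any, Dict, Mapping

# Hypothetical sentinel codes, inferred from this report's extreme-value tables.
SENTINELS: Dict[str, Any] = {
    "credit_score": 9999,
    "orig_dti": 999,
}


def mask_sentinels(row: Mapping[str, Any], sentinels: Mapping[str, Any] = SENTINELS) -> Dict[str, Any]:
    """Return a copy of `row` with sentinel codes replaced by None (missing)."""
    return {
        name: None if name in sentinels and value == sentinels[name] else value
        for name, value in row.items()
    }


if __name__ == "__main__":
    raw = {"credit_score": 771, "orig_dti": 999, "orig_ltv": 80}
    print(mask_sentinels(raw))
```

After masking, these rows would show up under Missing instead of inflating the mean, standard deviation, and 95th percentile.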

orig_upb
Real number (ℝ)

Distinct670
Distinct (%)1.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean212505.54
Minimum8000
Maximum790000
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size390.8 KiB

Quantile statistics

Minimum8000
5-th percentile68000
Q1123000
median188000
Q3280000
95-th percentile417000
Maximum790000
Range782000
Interquartile range (IQR)157000

Descriptive statistics

Standard deviation115814.41
Coefficient of variation (CV)0.54499479
Kurtosis1.111691
Mean212505.54
Median Absolute Deviation (MAD)74000
Skewness0.98875167
Sum1.0625277 × 10^10
Variance1.3412978 × 10^10
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
417000 1993
 
4.0%
100000 615
 
1.2%
200000 614
 
1.2%
150000 483
 
1.0%
300000 413
 
0.8%
120000 409
 
0.8%
140000 392
 
0.8%
160000 379
 
0.8%
180000 368
 
0.7%
250000 351
 
0.7%
Other values (660) 43983
88.0%
ValueCountFrequency (%)
8000 1
 
< 0.1%
10000 1
 
< 0.1%
13000 1
 
< 0.1%
15000 1
 
< 0.1%
16000 1
 
< 0.1%
17000 3
< 0.1%
18000 2
< 0.1%
19000 3
< 0.1%
20000 2
< 0.1%
21000 1
 
< 0.1%
ValueCountFrequency (%)
790000 1
 
< 0.1%
788000 1
 
< 0.1%
776000 1
 
< 0.1%
730000 66
0.1%
729000 19
 
< 0.1%
728000 5
 
< 0.1%
726000 1
 
< 0.1%
725000 3
 
< 0.1%
722000 3
 
< 0.1%
721000 2
 
< 0.1%

orig_ltv
Real number (ℝ)

HIGH CORRELATION 

Distinct121
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean66.5119
Minimum5
Maximum125
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size390.8 KiB

Quantile statistics

Minimum5
5-th percentile32
Q155
median71
Q380
95-th percentile90
Maximum125
Range120
Interquartile range (IQR)25

Descriptive statistics

Standard deviation17.695917
Coefficient of variation (CV)0.26605641
Kurtosis-0.019401098
Mean66.5119
Median Absolute Deviation (MAD)9
Skewness-0.63517569
Sum3325595
Variance313.14548
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
80 9161
 
18.3%
75 2472
 
4.9%
70 1320
 
2.6%
60 1292
 
2.6%
90 1284
 
2.6%
79 1206
 
2.4%
74 1089
 
2.2%
78 1033
 
2.1%
73 979
 
2.0%
69 926
 
1.9%
Other values (111) 29238
58.5%
ValueCountFrequency (%)
5 1
 
< 0.1%
6 1
 
< 0.1%
7 5
 
< 0.1%
8 10
 
< 0.1%
9 3
 
< 0.1%
10 13
 
< 0.1%
11 9
 
< 0.1%
12 21
< 0.1%
13 20
< 0.1%
14 36
0.1%
ValueCountFrequency (%)
125 3
< 0.1%
124 3
< 0.1%
123 2
< 0.1%
122 4
< 0.1%
121 1
 
< 0.1%
120 1
 
< 0.1%
119 2
< 0.1%
118 1
 
< 0.1%
117 3
< 0.1%
116 3
< 0.1%
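
Because a combined loan-to-value ratio includes every lien on the property, orig_cltv should not be smaller than orig_ltv for the same loan, which is also why the two columns are flagged as highly correlated. The Python sketch below is a simple row-level consistency check. The sample rows are made up for illustration and are not taken from the dataset.

```python
from typing import Dict, List, Sequence


def cltv_below_ltv(rows: Sequence[Dict[str, float]]) -> List[int]:
    """Indices of rows where combined LTV is lower than LTV (an inconsistency)."""
    return [
        index
        for index, row in enumerate(rows)
        if row["orig_cltv"] < row["orig_ltv"]
    ]


if __name__ == "__main__":
    # Illustrative rows only; field names follow the report's variable names.
    sample = [
        {"orig_ltv": 80, "orig_cltv": 80},
        {"orig_ltv": 75, "orig_cltv": 90},
        {"orig_ltv": 80, "orig_cltv": 70},  # inconsistent
    ]
    print(cltv_below_ltv(sample))
```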

orig_int
Real number (ℝ)

Distinct260
Distinct (%)0.5%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean4.9782446
Minimum3.5
Maximum7.875
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size390.8 KiB

Quantile statistics

Minimum3.5
5-th percentile4.375
Q14.75
median4.875
Q35.25
95-th percentile5.625
Maximum7.875
Range4.375
Interquartile range (IQR)0.5

Descriptive statistics

Standard deviation0.38970043
Coefficient of variation (CV)0.07828069
Kurtosis1.2562636
Mean4.9782446
Median Absolute Deviation (MAD)0.25
Skewness0.69814072
Sum248912.23
Variance0.15186642
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
4.875 9112
18.2%
4.75 7781
15.6%
5 4970
9.9%
5.25 4789
9.6%
5.125 3764
7.5%
5.375 3601
 
7.2%
4.5 2985
 
6.0%
4.625 2942
 
5.9%
5.5 2289
 
4.6%
4.375 1917
 
3.8%
Other values (250) 5850
11.7%
ValueCountFrequency (%)
3.5 1
 
< 0.1%
3.75 2
 
< 0.1%
3.875 3
 
< 0.1%
4 9
 
< 0.1%
4.125 11
 
< 0.1%
4.25 1532
3.1%
4.26 2
 
< 0.1%
4.27 1
 
< 0.1%
4.272 1
 
< 0.1%
4.275 1
 
< 0.1%
ValueCountFrequency (%)
7.875 1
 
< 0.1%
7.5 2
 
< 0.1%
7.375 1
 
< 0.1%
7.25 2
 
< 0.1%
7.125 3
 
< 0.1%
7 5
 
< 0.1%
6.875 14
 
< 0.1%
6.75 18
 
< 0.1%
6.625 40
0.1%
6.5 90
0.2%

channel
Categorical

HIGH CORRELATION 

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size390.8 KiB
R
29610 
C
12415 
B
7975 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters50000
Distinct characters3
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowR
2nd rowR
3rd rowR
4th rowR
5th rowR

Common Values

ValueCountFrequency (%)
R 29610
59.2%
C 12415
24.8%
B 7975
 
16.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
r 29610
59.2%
c 12415
24.8%
b 7975
 
16.0%

Most occurring characters

ValueCountFrequency (%)
R 29610
59.2%
C 12415
24.8%
B 7975
 
16.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 50000
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
R 29610
59.2%
C 12415
24.8%
B 7975
 
16.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 50000
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
R 29610
59.2%
C 12415
24.8%
B 7975
 
16.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
R 29610
59.2%
C 12415
24.8%
B 7975
 
16.0%

ppm
Boolean

CONSTANT 

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size49.0 KiB
False
50000 
ValueCountFrequency (%)
False 50000
100.0%
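
The 49.0 KiB memory size of this Boolean column, against 390.8 KiB for most other columns, is consistent with one byte per value instead of eight, plus a small fixed overhead. The Python sketch below reproduces both figures. The 128-byte overhead is an assumption chosen because it matches the reported numbers; the report does not break the memory size down.

```python
# Assumption: a fixed per-column overhead (for example an index) of 128 bytes.
ASSUMED_OVERHEAD_BYTES = 128


def column_kib(n_rows: int, bytes_per_value: int, overhead: int = ASSUMED_OVERHEAD_BYTES) -> float:
    """Estimated in-memory size of one column in KiB (1 KiB = 1024 bytes)."""
    return (n_rows * bytes_per_value + overhead) / 1024


if __name__ == "__main__":
    print(f"8-byte values: {column_kib(50_000, 8):.2f} KiB")
    print(f"1-byte values: {column_kib(50_000, 1):.2f} KiB")
```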

amort
Categorical

CONSTANT 

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size390.8 KiB
FRM
50000 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters150000
Distinct characters3
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowFRM
2nd rowFRM
3rd rowFRM
4th rowFRM
5th rowFRM

Common Values

ValueCountFrequency (%)
FRM 50000
100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
frm 50000
100.0%

Most occurring characters

ValueCountFrequency (%)
F 50000
33.3%
R 50000
33.3%
M 50000
33.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 150000
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
F 50000
33.3%
R 50000
33.3%
M 50000
33.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 150000
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
F 50000
33.3%
R 50000
33.3%
M 50000
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 150000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
F 50000
33.3%
R 50000
33.3%
M 50000
33.3%
Distinct54
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size390.8 KiB

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters100000
Distinct characters24
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowMN
2nd rowNY
3rd rowWA
4th rowNE
5th rowNE
ValueCountFrequency (%)
ca 5791
 
11.6%
il 3072
 
6.1%
tx 2164
 
4.3%
nc 1911
 
3.8%
oh 1902
 
3.8%
ny 1889
 
3.8%
pa 1870
 
3.7%
wi 1698
 
3.4%
wa 1672
 
3.3%
fl 1627
 
3.3%
Other values (44) 26404
52.8%

Most occurring characters

ValueCountFrequency (%)
A 16668
16.7%
C 10544
10.5%
N 10410
10.4%
I 8978
 
9.0%
M 7893
 
7.9%
O 5943
 
5.9%
L 5574
 
5.6%
T 4786
 
4.8%
W 3628
 
3.6%
Y 2814
 
2.8%
Other values (14) 22762
22.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 100000
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 16668
16.7%
C 10544
10.5%
N 10410
10.4%
I 8978
 
9.0%
M 7893
 
7.9%
O 5943
 
5.9%
L 5574
 
5.6%
T 4786
 
4.8%
W 3628
 
3.6%
Y 2814
 
2.8%
Other values (14) 22762
22.8%

Most occurring scripts

ValueCountFrequency (%)
Latin 100000
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 16668
16.7%
C 10544
10.5%
N 10410
10.4%
I 8978
 
9.0%
M 7893
 
7.9%
O 5943
 
5.9%
L 5574
 
5.6%
T 4786
 
4.8%
W 3628
 
3.6%
Y 2814
 
2.8%
Other values (14) 22762
22.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 100000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 16668
16.7%
C 10544
10.5%
N 10410
10.4%
I 8978
 
9.0%
M 7893
 
7.9%
O 5943
 
5.9%
L 5574
 
5.6%
T 4786
 
4.8%
W 3628
 
3.6%
Y 2814
 
2.8%
Other values (14) 22762
22.8%

prop_type
Categorical

IMBALANCE 

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size390.8 KiB
SF
37242 
PU
9699 
CO
 
2795
CP
 
143
MH
 
121

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters100000
Distinct characters8
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowSF
2nd rowSF
3rd rowSF
4th rowPU
5th rowSF

Common Values

ValueCountFrequency (%)
SF 37242
74.5%
PU 9699
 
19.4%
CO 2795
 
5.6%
CP 143
 
0.3%
MH 121
 
0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
sf 37242
74.5%
pu 9699
 
19.4%
co 2795
 
5.6%
cp 143
 
0.3%
mh 121
 
0.2%

Most occurring characters

ValueCountFrequency (%)
S 37242
37.2%
F 37242
37.2%
P 9842
 
9.8%
U 9699
 
9.7%
C 2938
 
2.9%
O 2795
 
2.8%
M 121
 
0.1%
H 121
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 100000
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 37242
37.2%
F 37242
37.2%
P 9842
 
9.8%
U 9699
 
9.7%
C 2938
 
2.9%
O 2795
 
2.8%
M 121
 
0.1%
H 121
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 100000
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 37242
37.2%
F 37242
37.2%
P 9842
 
9.8%
U 9699
 
9.7%
C 2938
 
2.9%
O 2795
 
2.8%
M 121
 
0.1%
H 121
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 100000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S 37242
37.2%
F 37242
37.2%
P 9842
 
9.8%
U 9699
 
9.7%
C 2938
 
2.9%
O 2795
 
2.8%
M 121
 
0.1%
H 121
 
0.1%

prop_zip
Real number (ℝ)

Distinct875
Distinct (%)1.8%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean52283.858
Minimum600
Maximum99900
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size390.8 KiB

Quantile statistics

Minimum600
5-th percentile5400
Q127700
median53000
Q380200
95-th percentile97100
Maximum99900
Range99300
Interquartile range (IQR)52500

Descriptive statistics

Standard deviation29933.93
Coefficient of variation (CV)0.57252718
Kurtosis-1.2294483
Mean52283.858
Median Absolute Deviation (MAD)25700
Skewness4.8581779 × 10^-6
Sum2.6141929 × 10^9
Variance8.9604016 × 10^8
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
94500 565
 
1.1%
60000 507
 
1.0%
84000 436
 
0.9%
60600 432
 
0.9%
60100 423
 
0.8%
75000 412
 
0.8%
98000 409
 
0.8%
30000 408
 
0.8%
60500 383
 
0.8%
92600 325
 
0.7%
Other values (865) 45700
91.4%
ValueCountFrequency (%)
600 12
 
< 0.1%
700 23
 
< 0.1%
800 3
 
< 0.1%
900 32
 
0.1%
1000 79
0.2%
1100 13
 
< 0.1%
1200 16
 
< 0.1%
1300 16
 
< 0.1%
1400 52
0.1%
1500 117
0.2%
ValueCountFrequency (%)
99900 2
 
< 0.1%
99800 11
 
< 0.1%
99700 7
 
< 0.1%
99600 35
 
0.1%
99500 90
0.2%
99400 5
 
< 0.1%
99300 55
0.1%
99200 82
0.2%
99100 14
 
< 0.1%
99000 38
0.1%
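
Because prop_zip is typed as a real number, leading zeros are lost (the minimum appears as 600), and the mean, variance, and skewness have no geographic meaning. Every value listed is a multiple of 100, which suggests that only the first three digits of the ZIP code are kept; that reading is an assumption. The Python sketch below restores a fixed-width string form so the column can be profiled as categorical.

```python
def zip_to_string(value: float) -> str:
    """Render a numeric ZIP code as a zero-padded five-character string."""
    return f"{int(value):05d}"


def zip3_prefix(value: float) -> str:
    """First three digits of the ZIP code (assumed to be the unmasked part)."""
    return zip_to_string(value)[:3]


if __name__ == "__main__":
    for raw in (600, 5400, 94500):
        print(raw, zip_to_string(raw), zip3_prefix(raw))
```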

loan_id
Text

UNIQUE 

Distinct50000
Distinct (%)100.0%
Missing0
Missing (%)0.0%
Memory size390.8 KiB

Length

Max length12
Median length12
Mean length12
Min length12

Characters and Unicode

Total characters600000
Distinct characters12
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique50000 ?
Unique (%)100.0%

Sample

1st rowF09Q10000013
2nd rowF09Q10000078
3rd rowF09Q10000148
4th rowF09Q10000154
5th rowF09Q10000180
ValueCountFrequency (%)
f09q10000013 1
 
< 0.1%
f09q10002187 1
 
< 0.1%
f09q10000787 1
 
< 0.1%
f09q10000556 1
 
< 0.1%
f09q10000148 1
 
< 0.1%
f09q10000154 1
 
< 0.1%
f09q10000180 1
 
< 0.1%
f09q10000181 1
 
< 0.1%
f09q10000183 1
 
< 0.1%
f09q10000216 1
 
< 0.1%
Other values (49990) 49990
> 99.9%

Most occurring characters

ValueCountFrequency (%)
0 134574
22.4%
9 74766
12.5%
F 50000
 
8.3%
Q 50000
 
8.3%
1 47400
 
7.9%
2 47157
 
7.9%
3 47059
 
7.8%
4 44493
 
7.4%
5 28290
 
4.7%
6 26704
 
4.5%
Other values (2) 49557
 
8.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 500000
83.3%
Uppercase Letter 100000
 
16.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 134574
26.9%
9 74766
15.0%
1 47400
 
9.5%
2 47157
 
9.4%
3 47059
 
9.4%
4 44493
 
8.9%
5 28290
 
5.7%
6 26704
 
5.3%
7 25011
 
5.0%
8 24546
 
4.9%
Uppercase Letter
ValueCountFrequency (%)
F 50000
50.0%
Q 50000
50.0%

Most occurring scripts

ValueCountFrequency (%)
Common 500000
83.3%
Latin 100000
 
16.7%

Most frequent character per script

Common
ValueCountFrequency (%)
0 134574
26.9%
9 74766
15.0%
1 47400
 
9.5%
2 47157
 
9.4%
3 47059
 
9.4%
4 44493
 
8.9%
5 28290
 
5.7%
6 26704
 
5.3%
7 25011
 
5.0%
8 24546
 
4.9%
Latin
ValueCountFrequency (%)
F 50000
50.0%
Q 50000
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 600000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 134574
22.4%
9 74766
12.5%
F 50000
 
8.3%
Q 50000
 
8.3%
1 47400
 
7.9%
2 47157
 
7.9%
3 47059
 
7.8%
4 44493
 
7.4%
5 28290
 
4.7%
6 26704
 
4.5%
Other values (2) 49557
 
8.3%
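
The sample identifiers and the character statistics (two uppercase letters and ten digits in each 12-character value) point to a fixed layout: one letter, a two-digit year, the letter Q with a quarter digit, and a seven-digit sequence number. The Python sketch below parses that layout. The field meanings and the mapping of the two-digit year to 20YY are assumptions based only on the pattern visible in this report.

```python
import re
from typing import NamedTuple

# Assumed layout: <letter><YY>Q<quarter><7-digit sequence>, e.g. F09Q10000013.
LOAN_ID_PATTERN = re.compile(r"^([A-Z])(\d{2})Q([1-4])(\d{7})$")


class LoanId(NamedTuple):
    prefix: str
    year: int
    quarter: int
    sequence: int


def parse_loan_id(loan_id: str) -> LoanId:
    """Split a loan_id into its assumed components."""
    match = LOAN_ID_PATTERN.match(loan_id)
    if match is None:
        raise ValueError(f"unexpected loan_id layout: {loan_id!r}")
    prefix, year, quarter, sequence = match.groups()
    # Assumption: two-digit years refer to 20YY.
    return LoanId(prefix, 2000 + int(year), int(quarter), int(sequence))


if __name__ == "__main__":
    print(parse_loan_id("F09Q10000013"))
```

Splitting the identifier this way turns a unique text column into a few low-cardinality fields that are easier to profile.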

loan_purp
Categorical

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size390.8 KiB
N
25554 
C
14123 
P
10323 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters50000
Distinct characters3
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: C
2nd row: P
3rd row: C
4th row: N
5th row: N

Common Values

Value | Count | Frequency (%)
N | 25554 | 51.1%
C | 14123 | 28.2%
P | 10323 | 20.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
n 25554
51.1%
c 14123
28.2%
p 10323
20.6%

Most occurring characters

ValueCountFrequency (%)
N 25554
51.1%
C 14123
28.2%
P 10323
20.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 50000
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 25554
51.1%
C 14123
28.2%
P 10323
20.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 50000
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 25554
51.1%
C 14123
28.2%
P 10323
20.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 25554
51.1%
C 14123
28.2%
P 10323
20.6%

orig_loan_term
Real number (ℝ)

HIGH CORRELATION 

Distinct: 121
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 320.68482
Minimum: 60
Maximum: 360
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 390.8 KiB

Quantile statistics

Minimum: 60
5-th percentile: 180
Q1: 360
median: 360
Q3: 360
95-th percentile: 360
Maximum: 360
Range: 300
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 73.252147
Coefficient of variation (CV): 0.22842412
Kurtosis: 0.14735424
Mean: 320.68482
Median Absolute Deviation (MAD): 0
Skewness: -1.4113492
Sum: 16034241
Variance: 5365.877
Monotonicity: Not monotonic
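
Several of the figures reported for orig_loan_term follow arithmetically from the others, which gives a quick consistency check. The sketch below recomputes them from the reported mean, standard deviation, quartiles and extremes; the function name `derived_stats` is hypothetical, and the inputs are copied from this section rather than recomputed from the raw data.

```python
import math

def derived_stats(n, mean, std, q1, q3, minimum, maximum):
    """Recompute the summary figures that follow from other reported values."""
    return {
        "cv": std / mean,            # coefficient of variation
        "variance": std ** 2,
        "sum": mean * n,
        "iqr": q3 - q1,              # interquartile range
        "range": maximum - minimum,
    }

# Inputs copied from the orig_loan_term summary in this report.
stats = derived_stats(n=50000, mean=320.68482, std=73.252147,
                      q1=360, q3=360, minimum=60, maximum=360)
for name, value in stats.items():
    print(f"{name}: {value:.8g}")
```

An IQR of 0 alongside a range of 300 reflects the concentration at 360 months: Q1, the median and Q3 all equal 360.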
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
360 | 38238 | 76.5%
180 | 8806 | 17.6%
240 | 1637 | 3.3%
120 | 488 | 1.0%
300 | 431 | 0.9%
144 | 29 | 0.1%
324 | 23 | < 0.1%
336 | 22 | < 0.1%
168 | 18 | < 0.1%
156 | 18 | < 0.1%
Other values (111) | 290 | 0.6%

Value | Count | Frequency (%)
60 | 1 | < 0.1%
96 | 1 | < 0.1%
101 | 1 | < 0.1%
119 | 1 | < 0.1%
120 | 488 | 1.0%
121 | 14 | < 0.1%
130 | 1 | < 0.1%
131 | 1 | < 0.1%
132 | 8 | < 0.1%
135 | 1 | < 0.1%

Value | Count | Frequency (%)
360 | 38238 | 76.5%
359 | 5 | < 0.1%
358 | 2 | < 0.1%
357 | 2 | < 0.1%
356 | 4 | < 0.1%
355 | 2 | < 0.1%
354 | 4 | < 0.1%
353 | 4 | < 0.1%
352 | 1 | < 0.1%
351 | 2 | < 0.1%

borrower_num
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 390.8 KiB

Value | Count
2 | 30704
1 | 19290
99 | 6

Length

Max length: 2
Median length: 1
Mean length: 1.00012
Min length: 1

Characters and Unicode

Total characters: 50006
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 1
3rd row: 1
4th row: 1
5th row: 2

Common Values

Value | Count | Frequency (%)
2 | 30704 | 61.4%
1 | 19290 | 38.6%
99 | 6 | < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
2 30704
61.4%
1 19290
38.6%
99 6
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
2 30704
61.4%
1 19290
38.6%
9 12
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 50006
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 30704
61.4%
1 19290
38.6%
9 12
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 50006
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 30704
61.4%
1 19290
38.6%
9 12
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50006
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 30704
61.4%
1 19290
38.6%
9 12
 
< 0.1%

seller
Categorical

HIGH CORRELATION 

Distinct: 17
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 390.8 KiB

Value | Count
WELLS FARGO BANK, N.A. | 13171
Other sellers | 11972
BANK OF AMERICA, N.A. | 4522
U.S. BANK N.A. | 3762
CHASE HOME FINANCE LLC | 3580
Other values (12) | 12993

Length

Max length: 52
Median length: 38
Mean length: 20.50626
Min length: 12

Characters and Unicode

Total characters: 1025313
Distinct characters: 32
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Other sellers
2nd row: Other sellers
3rd row: Other sellers
4th row: Other sellers
5th row: Other sellers

Common Values

Value | Count | Frequency (%)
WELLS FARGO BANK, N.A. | 13171 | 26.3%
Other sellers | 11972 | 23.9%
BANK OF AMERICA, N.A. | 4522 | 9.0%
U.S. BANK N.A. | 3762 | 7.5%
CHASE HOME FINANCE LLC | 3580 | 7.2%
BRANCH BANKING & TRUST COMPANY | 2261 | 4.5%
PROVIDENT FUNDING ASSOCIATES, L.P. | 1894 | 3.8%
CITIMORTGAGE, INC. | 1655 | 3.3%
FIFTH THIRD BANK | 1541 | 3.1%
METLIFE HOME LOANS, A DIVISION OF METLIFE BANK, N.A. | 1440 | 2.9%
Other values (7) | 4202 | 8.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
bank 25790
14.8%
n.a 22895
13.1%
wells 13171
 
7.6%
fargo 13171
 
7.6%
other 11972
 
6.9%
sellers 11972
 
6.9%
of 5962
 
3.4%
home 5020
 
2.9%
llc 4576
 
2.6%
america 4522
 
2.6%
Other values (34) 55132
31.7%

Most occurring characters

ValueCountFrequency (%)
124183
 
12.1%
A 101402
 
9.9%
N 78263
 
7.6%
. 60210
 
5.9%
O 52422
 
5.1%
E 45396
 
4.4%
L 42875
 
4.2%
e 35916
 
3.5%
R 35034
 
3.4%
S 34751
 
3.4%
Other values (22) 414861
40.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 679229
66.2%
Lowercase Letter 131692
 
12.8%
Space Separator 124183
 
12.1%
Other Punctuation 90209
 
8.8%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 101402
14.9%
N 78263
11.5%
O 52422
 
7.7%
E 45396
 
6.7%
L 42875
 
6.3%
R 35034
 
5.2%
S 34751
 
5.1%
I 33536
 
4.9%
B 31479
 
4.6%
F 31431
 
4.6%
Other values (12) 192640
28.4%
Lowercase Letter
ValueCountFrequency (%)
e 35916
27.3%
r 23944
18.2%
s 23944
18.2%
l 23944
18.2%
h 11972
 
9.1%
t 11972
 
9.1%
Other Punctuation
ValueCountFrequency (%)
. 60210
66.7%
, 27002
29.9%
& 2997
 
3.3%
Space Separator
ValueCountFrequency (%)
124183
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 810921
79.1%
Common 214392
 
20.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 101402
 
12.5%
N 78263
 
9.7%
O 52422
 
6.5%
E 45396
 
5.6%
L 42875
 
5.3%
e 35916
 
4.4%
R 35034
 
4.3%
S 34751
 
4.3%
I 33536
 
4.1%
B 31479
 
3.9%
Other values (18) 319847
39.4%
Common
ValueCountFrequency (%)
124183
57.9%
. 60210
28.1%
, 27002
 
12.6%
& 2997
 
1.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1025313
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
124183
 
12.1%
A 101402
 
9.9%
N 78263
 
7.6%
. 60210
 
5.9%
O 52422
 
5.1%
E 45396
 
4.4%
L 42875
 
4.2%
e 35916
 
3.5%
R 35034
 
3.4%
S 34751
 
3.4%
Other values (22) 414861
40.5%

servicer
Categorical

HIGH CORRELATION 

Distinct: 19
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 390.8 KiB

Value | Count
Other servicers | 14697
WELLS FARGO BANK, N.A. | 13243
U.S. BANK N.A. | 4689
BANK OF AMERICA, N.A. | 4304
JPMORGAN CHASE BANK, N.A. | 3627
Other values (14) | 9440

Length

Max length: 52
Median length: 41
Mean length: 20.57748
Min length: 11

Characters and Unicode

Total characters: 1028874
Distinct characters: 35
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: U.S. BANK N.A.
2nd row: Other servicers
3rd row: U.S. BANK N.A.
4th row: Other servicers
5th row: Other servicers

Common Values

Value | Count | Frequency (%)
Other servicers | 14697 | 29.4%
WELLS FARGO BANK, N.A. | 13243 | 26.5%
U.S. BANK N.A. | 4689 | 9.4%
BANK OF AMERICA, N.A. | 4304 | 8.6%
JPMORGAN CHASE BANK, N.A. | 3627 | 7.3%
PROVIDENT FUNDING ASSOCIATES, L.P. | 1837 | 3.7%
BRANCH BANKING & TRUST COMPANY | 1772 | 3.5%
CITIMORTGAGE, INC. | 1403 | 2.8%
FIFTH THIRD BANK | 986 | 2.0%
METLIFE HOME LOANS, A DIVISION OF METLIFE BANK, N.A. | 928 | 1.9%
Other values (9) | 2514 | 5.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
bank 28961
17.3%
n.a 26791
16.0%
other 14697
 
8.8%
servicers 14697
 
8.8%
wells 13243
 
7.9%
fargo 13243
 
7.9%
of 5232
 
3.1%
u.s 4689
 
2.8%
america 4304
 
2.6%
jpmorgan 4287
 
2.6%
Other values (35) 37162
22.2%

Most occurring characters

ValueCountFrequency (%)
117306
 
11.4%
A 103663
 
10.1%
N 80797
 
7.9%
. 68590
 
6.7%
O 51829
 
5.0%
e 44091
 
4.3%
r 44091
 
4.3%
S 34473
 
3.4%
E 33452
 
3.3%
R 32685
 
3.2%
Other values (25) 417897
40.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 622377
60.5%
Lowercase Letter 191061
 
18.6%
Space Separator 117306
 
11.4%
Other Punctuation 98130
 
9.5%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 103663
16.7%
N 80797
13.0%
O 51829
 
8.3%
S 34473
 
5.5%
E 33452
 
5.4%
R 32685
 
5.3%
L 32586
 
5.2%
B 32505
 
5.2%
K 30733
 
4.9%
G 26913
 
4.3%
Other values (13) 162741
26.1%
Lowercase Letter
ValueCountFrequency (%)
e 44091
23.1%
r 44091
23.1%
s 29394
15.4%
c 14697
 
7.7%
i 14697
 
7.7%
v 14697
 
7.7%
h 14697
 
7.7%
t 14697
 
7.7%
Other Punctuation
ValueCountFrequency (%)
. 68590
69.9%
, 27768
28.3%
& 1772
 
1.8%
Space Separator
ValueCountFrequency (%)
117306
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 813438
79.1%
Common 215436
 
20.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 103663
 
12.7%
N 80797
 
9.9%
O 51829
 
6.4%
e 44091
 
5.4%
r 44091
 
5.4%
S 34473
 
4.2%
E 33452
 
4.1%
R 32685
 
4.0%
L 32586
 
4.0%
B 32505
 
4.0%
Other values (21) 323266
39.7%
Common
ValueCountFrequency (%)
117306
54.5%
. 68590
31.8%
, 27768
 
12.9%
& 1772
 
0.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1028874
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
117306
 
11.4%
A 103663
 
10.1%
N 80797
 
7.9%
. 68590
 
6.7%
O 51829
 
5.0%
e 44091
 
4.3%
r 44091
 
4.3%
S 34473
 
3.4%
E 33452
 
3.3%
R 32685
 
3.2%
Other values (25) 417897
40.6%

super_conforming
Boolean

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 0.1%
Missing: 48944
Missing (%): 97.9%
Memory size: 97.8 KiB

Value | Count | Frequency (%)
True | 1056 | 2.1%
(Missing) | 48944 | 97.9%

prr_loan_seq_num
Text

MISSING 

Distinct: 5315
Distinct (%): 100.0%
Missing: 44685
Missing (%): 89.4%
Memory size: 390.8 KiB

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 63780
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 5315
Unique (%): 100.0%

Sample

1st row: F06Q10344882
2nd row: F04Q20024019
3rd row: F06Q30357730
4th row: F07Q40192786
5th row: F05Q30012809
ValueCountFrequency (%)
f04q10332219 1
 
< 0.1%
f05q20004455 1
 
< 0.1%
f06q30357730 1
 
< 0.1%
f07q40192786 1
 
< 0.1%
f05q30012809 1
 
< 0.1%
a06q30002868 1
 
< 0.1%
f06q40000894 1
 
< 0.1%
f06q20001301 1
 
< 0.1%
f04q30007791 1
 
< 0.1%
f06q40005198 1
 
< 0.1%
Other values (5305) 5305
99.8%

Most occurring characters

ValueCountFrequency (%)
0 14727
23.1%
3 5453
 
8.5%
2 5419
 
8.5%
Q 5315
 
8.3%
1 5254
 
8.2%
F 4970
 
7.8%
4 4849
 
7.6%
7 3883
 
6.1%
6 3753
 
5.9%
8 3707
 
5.8%
Other values (3) 6450
10.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 53150
83.3%
Uppercase Letter 10630
 
16.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 14727
27.7%
3 5453
 
10.3%
2 5419
 
10.2%
1 5254
 
9.9%
4 4849
 
9.1%
7 3883
 
7.3%
6 3753
 
7.1%
8 3707
 
7.0%
5 3372
 
6.3%
9 2733
 
5.1%
Uppercase Letter
ValueCountFrequency (%)
Q 5315
50.0%
F 4970
46.8%
A 345
 
3.2%

Most occurring scripts

ValueCountFrequency (%)
Common 53150
83.3%
Latin 10630
 
16.7%

Most frequent character per script

Common
ValueCountFrequency (%)
0 14727
27.7%
3 5453
 
10.3%
2 5419
 
10.2%
1 5254
 
9.9%
4 4849
 
9.1%
7 3883
 
7.3%
6 3753
 
7.1%
8 3707
 
7.0%
5 3372
 
6.3%
9 2733
 
5.1%
Latin
ValueCountFrequency (%)
Q 5315
50.0%
F 4970
46.8%
A 345
 
3.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 63780
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 14727
23.1%
3 5453
 
8.5%
2 5419
 
8.5%
Q 5315
 
8.3%
1 5254
 
8.2%
F 4970
 
7.8%
4 4849
 
7.6%
7 3883
 
6.1%
6 3753
 
5.9%
8 3707
 
5.8%
Other values (3) 6450
10.1%

program
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 390.8 KiB

Value | Count
9 | 50000

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 50000
Distinct characters: 1
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 9
2nd row: 9
3rd row: 9
4th row: 9
5th row: 9

Common Values

Value | Count | Frequency (%)
9 | 50000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
9 50000
100.0%

Most occurring characters

ValueCountFrequency (%)
9 50000
100.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 50000
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
9 50000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 50000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
9 50000
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
9 50000
100.0%

relief_refi
Boolean

CONSTANT  MISSING 

Distinct: 1
Distinct (%): < 0.1%
Missing: 44685
Missing (%): 89.4%
Memory size: 97.8 KiB

Value | Count | Frequency (%)
True | 5315 | 10.6%
(Missing) | 44685 | 89.4%

prop_val
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 390.8 KiB

Value | Count
9 | 50000

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 50000
Distinct characters: 1
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 9
2nd row: 9
3rd row: 9
4th row: 9
5th row: 9

Common Values

Value | Count | Frequency (%)
9 | 50000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
9 50000
100.0%

Most occurring characters

ValueCountFrequency (%)
9 50000
100.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 50000
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
9 50000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 50000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
9 50000
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
9 50000
100.0%

interest_only
Boolean

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 49.0 KiB

Value | Count | Frequency (%)
False | 50000 | 100.0%

MI_cancel
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 390.8 KiB

Value | Count
9 | 50000

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 50000
Distinct characters: 1
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 9
2nd row: 9
3rd row: 9
4th row: 9
5th row: 9

Common Values

Value | Count | Frequency (%)
9 | 50000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
9 50000
100.0%

Most occurring characters

ValueCountFrequency (%)
9 50000
100.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 50000
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
9 50000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 50000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
9 50000
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
9 50000
100.0%

Interactions

[Interaction plots: 144 image panels, not preserved in this text export.]

Correlations

2023-11-13T11:40:06.450216image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
credit_scorefirst_paymat_datemsamortgage_ins_pctorig_cltvorig_dtiorig_upborig_ltvorig_intprop_ziporig_loan_termfhaunit_numoccupancychannelprop_typeloan_purpborrower_numsellerservicer
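credit_score | 1.000 | -0.052 | -0.077 | 0.020 | -0.068 | -0.183 | -0.217 | -0.001 | -0.167 | -0.177 | 0.034 | -0.050 | 0.012 | 0.000 | 0.000 | 0.005 | 0.000 | 0.019 | 0.006 | 0.000 | 0.000
first_pay | -0.052 | 1.000 | 0.573 | 0.012 | 0.006 | 0.104 | 0.172 | -0.050 | 0.099 | 0.135 | 0.001 | -0.057 | 0.012 | 0.000 | 0.000 | 0.026 | 0.000 | 0.023 | 0.005 | 0.068 | 0.066
mat_date | -0.077 | 0.573 | 1.000 | 0.041 | 0.073 | 0.240 | 0.180 | 0.132 | 0.235 | 0.460 | 0.045 | 0.740 | 0.073 | 0.000 | 0.038 | 0.069 | 0.053 | 0.144 | 0.058 | 0.079 | 0.070
msa | 0.020 | 0.012 | 0.041 | 1.000 | -0.027 | -0.050 | 0.025 | 0.056 | -0.044 | 0.012 | 0.188 | 0.043 | 0.025 | 0.043 | 0.048 | 0.065 | 0.116 | 0.055 | 0.003 | 0.084 | 0.079
mortgage_ins_pct | -0.068 | 0.006 | 0.073 | -0.027 | 1.000 | 0.398 | 0.019 | -0.003 | 0.421 | 0.071 | -0.017 | 0.089 | 0.154 | 0.010 | 0.050 | 0.059 | 0.016 | 0.188 | 0.031 | 0.054 | 0.041
orig_cltv | -0.183 | 0.104 | 0.240 | -0.050 | 0.398 | 1.000 | 0.224 | 0.136 | 0.941 | 0.224 | -0.016 | 0.214 | 0.128 | 0.013 | 0.069 | 0.078 | 0.025 | 0.235 | 0.053 | 0.046 | 0.035
orig_dti | -0.217 | 0.172 | 0.180 | 0.025 | 0.019 | 0.224 | 1.000 | 0.083 | 0.198 | 0.179 | 0.030 | 0.094 | 0.091 | 0.000 | 0.021 | 0.256 | 0.031 | 0.335 | 0.014 | 0.229 | 0.196
orig_upb | -0.001 | -0.050 | 0.132 | 0.056 | -0.003 | 0.136 | 0.083 | 1.000 | 0.102 | -0.071 | 0.051 | 0.220 | 0.019 | 0.056 | 0.103 | 0.107 | 0.072 | 0.076 | 0.104 | 0.072 | 0.077
orig_ltv | -0.167 | 0.099 | 0.235 | -0.044 | 0.421 | 0.941 | 0.198 | 0.102 | 1.000 | 0.234 | -0.014 | 0.210 | 0.177 | 0.033 | 0.087 | 0.067 | 0.029 | 0.273 | 0.069 | 0.048 | 0.037
orig_int | -0.177 | 0.135 | 0.460 | 0.012 | 0.071 | 0.224 | 0.179 | -0.071 | 0.234 | 1.000 | 0.014 | 0.443 | 0.062 | 0.058 | 0.231 | 0.027 | 0.033 | 0.126 | 0.073 | 0.072 | 0.065
prop_zip | 0.034 | 0.001 | 0.045 | 0.188 | -0.017 | -0.016 | 0.030 | 0.051 | -0.014 | 0.014 | 1.000 | 0.063 | 0.048 | 0.052 | 0.087 | 0.146 | 0.166 | 0.088 | 0.044 | 0.197 | 0.175
orig_loan_term | -0.050 | -0.057 | 0.740 | 0.043 | 0.089 | 0.214 | 0.094 | 0.220 | 0.210 | 0.443 | 0.063 | 1.000 | 0.072 | 0.000 | 0.038 | 0.068 | 0.051 | 0.141 | 0.058 | 0.077 | 0.069
fha | 0.012 | 0.012 | 0.073 | 0.025 | 0.154 | 0.128 | 0.091 | 0.019 | 0.177 | 0.062 | 0.048 | 0.072 | 1.000 | 0.000 | 0.053 | 0.039 | 0.069 | 0.365 | 0.300 | 0.092 | 0.084
unit_num | 0.000 | 0.000 | 0.000 | 0.043 | 0.010 | 0.013 | 0.000 | 0.056 | 0.033 | 0.058 | 0.052 | 0.000 | 0.000 | 1.000 | 0.189 | 0.008 | 0.034 | 0.024 | 0.013 | 0.023 | 0.023
occupancy | 0.000 | 0.000 | 0.038 | 0.048 | 0.050 | 0.069 | 0.021 | 0.103 | 0.087 | 0.231 | 0.087 | 0.038 | 0.053 | 0.189 | 1.000 | 0.023 | 0.080 | 0.120 | 0.029 | 0.057 | 0.050
channel | 0.005 | 0.026 | 0.069 | 0.065 | 0.059 | 0.078 | 0.256 | 0.107 | 0.067 | 0.027 | 0.146 | 0.068 | 0.039 | 0.008 | 0.023 | 1.000 | 0.066 | 0.107 | 0.025 | 0.519 | 0.401
prop_type | 0.000 | 0.000 | 0.053 | 0.116 | 0.016 | 0.025 | 0.031 | 0.072 | 0.029 | 0.033 | 0.166 | 0.051 | 0.069 | 0.034 | 0.080 | 0.066 | 1.000 | 0.105 | 0.084 | 0.108 | 0.103
loan_purp | 0.019 | 0.023 | 0.144 | 0.055 | 0.188 | 0.235 | 0.335 | 0.076 | 0.273 | 0.126 | 0.088 | 0.141 | 0.365 | 0.024 | 0.120 | 0.107 | 0.105 | 1.000 | 0.069 | 0.110 | 0.101
borrower_num | 0.006 | 0.005 | 0.058 | 0.003 | 0.031 | 0.053 | 0.014 | 0.104 | 0.069 | 0.073 | 0.044 | 0.058 | 0.300 | 0.013 | 0.029 | 0.025 | 0.084 | 0.069 | 1.000 | 0.046 | 0.032
seller | 0.000 | 0.068 | 0.079 | 0.084 | 0.054 | 0.046 | 0.229 | 0.072 | 0.048 | 0.072 | 0.197 | 0.077 | 0.092 | 0.023 | 0.057 | 0.519 | 0.108 | 0.110 | 0.046 | 1.000 | 0.779
servicer | 0.000 | 0.066 | 0.070 | 0.079 | 0.041 | 0.035 | 0.196 | 0.077 | 0.037 | 0.065 | 0.175 | 0.069 | 0.084 | 0.023 | 0.050 | 0.401 | 0.103 | 0.101 | 0.032 | 0.779 | 1.000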
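
As a rough illustration of how the high-correlation alerts relate to this matrix, the sketch below scans a small excerpt (five variables, values copied from the rows above) for pairs at or above a cutoff. The 0.5 cutoff and the helper name `high_pairs` are assumptions made for this example; the report itself does not state which threshold was used.

```python
# Upper-triangle entries for five variables, copied from the matrix above.
CORR = {
    ("mat_date", "orig_cltv"): 0.240,
    ("mat_date", "orig_ltv"): 0.235,
    ("mat_date", "orig_int"): 0.460,
    ("mat_date", "orig_loan_term"): 0.740,
    ("orig_cltv", "orig_ltv"): 0.941,
    ("orig_cltv", "orig_int"): 0.224,
    ("orig_cltv", "orig_loan_term"): 0.214,
    ("orig_ltv", "orig_int"): 0.234,
    ("orig_ltv", "orig_loan_term"): 0.210,
    ("orig_int", "orig_loan_term"): 0.443,
}


def high_pairs(corr, threshold=0.5):
    """Return (var_a, var_b, value) for pairs with |value| >= threshold, strongest first.

    The default threshold of 0.5 is an assumption for illustration only.
    """
    hits = [(a, b, v) for (a, b), v in corr.items() if abs(v) >= threshold]
    return sorted(hits, key=lambda item: abs(item[2]), reverse=True)


for var_a, var_b, value in high_pairs(CORR):
    print(f"{var_a} ~ {var_b}: {value:.3f}")
```

On this excerpt only orig_cltv/orig_ltv (0.941) and mat_date/orig_loan_term (0.740) clear the assumed cutoff, while mat_date/orig_int (0.460) does not, which is consistent with the alerts listed in the overview.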

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
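
To make the idea of nullity correlation concrete, here is a minimal sketch that turns each column into a 0/1 "value present" indicator and computes the Pearson correlation between two indicators. The toy table, its missingness pattern, and the helper names `pearson` and `nullity_corr` are hypothetical and chosen only for illustration; this is not necessarily how the report's heatmap was computed.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A column that is never (or always) missing has no nullity variation.
        return float("nan")
    return cov / sqrt(var_x * var_y)


def nullity_corr(table, col_a, col_b):
    """Correlate the presence indicators (1 = value present, 0 = missing) of two columns."""
    present_a = [0 if v is None else 1 for v in table[col_a]]
    present_b = [0 if v is None else 1 for v in table[col_b]]
    return pearson(present_a, present_b)


# Hypothetical six-row table; None marks a missing cell.
TOY = {
    "msa": [24020.0, None, 13380.0, 36540.0, None, 41500.0],
    "prr_loan_seq_num": [None, None, "A1", None, "A2", None],
    "relief_refi": [None, None, "Y", None, "Y", None],
}

print(nullity_corr(TOY, "prr_loan_seq_num", "relief_refi"))
print(nullity_corr(TOY, "msa", "prr_loan_seq_num"))
```

In the toy table the two refinance columns are missing on exactly the same rows, so their nullity correlation is approximately 1.0, while msa against prr_loan_seq_num comes out at about -0.25. With a pandas DataFrame, a comparable matrix can usually be obtained from `df.isna().corr()`.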

Sample

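First rows

index | credit_score | first_pay | fha | mat_date | msa | mortgage_ins_pct | unit_num | occupancy | orig_cltv | orig_dti | orig_upb | orig_ltv | orig_int | channel | ppm | amort | prop_state | prop_type | prop_zip | loan_id | loan_purp | orig_loan_term | borrower_num | seller | servicer | super_conforming | prr_loan_seq_num | program | relief_refi | prop_val | interest_only | MI_cancel
0 | 795 | 200903 | N | 202402 | NaN | 0 | 1 | P | 60 | 12 | 157000 | 60 | 4.500 | R | N | FRM | MN | SF | 56600 | F09Q10000013 | C | 180 | 2 | Other sellers | U.S. BANK N.A. | NaN | NaN | 9 | NaN | 9 | N | 9
1 | 766 | 200903 | Y | 203902 | 24020.0 | 0 | 1 | P | 80 | 31 | 128000 | 80 | 5.750 | R | N | FRM | NY | SF | 12800 | F09Q10000078 | P | 360 | 1 | Other sellers | Other servicers | NaN | NaN | 9 | NaN | 9 | N | 9
2 | 701 | 200903 | N | 203902 | 13380.0 | 0 | 1 | P | 74 | 61 | 162000 | 74 | 5.625 | R | N | FRM | WA | SF | 98200 | F09Q10000148 | C | 360 | 1 | Other sellers | U.S. BANK N.A. | NaN | NaN | 9 | NaN | 9 | N | 9
3 | 791 | 200903 | N | 203902 | 36540.0 | 0 | 1 | P | 64 | 38 | 184000 | 64 | 5.625 | R | N | FRM | NE | PU | 68100 | F09Q10000154 | N | 360 | 1 | Other sellers | Other servicers | NaN | NaN | 9 | NaN | 9 | N | 9
4 | 725 | 200903 | N | 203902 | 36540.0 | 0 | 1 | P | 85 | 29 | 130000 | 80 | 5.500 | R | N | FRM | NE | SF | 68100 | F09Q10000180 | N | 360 | 2 | Other sellers | Other servicers | NaN | NaN | 9 | NaN | 9 | N | 9
5 | 770 | 200904 | N | 203903 | 41500.0 | 0 | 1 | P | 75 | 25 | 195000 | 75 | 5.125 | R | N | FRM | CA | PU | 93900 | F09Q10000181 | C | 360 | 2 | Other sellers | U.S. BANK N.A. | NaN | NaN | 9 | NaN | 9 | N | 9
6 | 805 | 200905 | N | 203904 | 29100.0 | 0 | 1 | P | 75 | 38 | 108000 | 75 | 5.125 | B | N | FRM | WI | SF | 54600 | F09Q10000183 | C | 360 | 1 | Other sellers | Other servicers | NaN | NaN | 9 | NaN | 9 | N | 9
7 | 645 | 200904 | N | 203903 | NaN | 0 | 1 | P | 72 | 41 | 160000 | 72 | 5.625 | R | N | FRM | KY | SF | 40700 | F09Q10000216 | C | 360 | 2 | Other sellers | Other servicers | NaN | NaN | 9 | NaN | 9 | N | 9
8 | 763 | 200903 | N | 203902 | 23844.0 | 0 | 1 | P | 80 | 27 | 148000 | 80 | 5.500 | R | N | FRM | IN | SF | 46300 | F09Q10000255 | C | 360 | 2 | Other sellers | U.S. BANK N.A. | NaN | NaN | 9 | NaN | 9 | N | 9
9 | 748 | 200903 | N | 203902 | NaN | 0 | 1 | S | 80 | 43 | 320000 | 80 | 5.625 | R | N | FRM | MI | SF | 49700 | F09Q10000449 | P | 360 | 2 | Other sellers | CENTRAL MORTGAGE COMPANY | NaN | NaN | 9 | NaN | 9 | N | 9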
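
Last rows

index | credit_score | first_pay | fha | mat_date | msa | mortgage_ins_pct | unit_num | occupancy | orig_cltv | orig_dti | orig_upb | orig_ltv | orig_int | channel | ppm | amort | prop_state | prop_type | prop_zip | loan_id | loan_purp | orig_loan_term | borrower_num | seller | servicer | super_conforming | prr_loan_seq_num | program | relief_refi | prop_val | interest_only | MI_cancel
49990 | 741 | 201001 | N | 203912 | 11244.0 | 0 | 1 | P | 20 | 16 | 153000 | 20 | 5.125 | R | N | FRM | CA | SF | 90700 | F09Q40451505 | N | 360 | 1 | Other sellers | Other servicers | NaN | NaN | 9 | NaN | 9 | N | 9
49991 | 734 | 200912 | Y | 203911 | 17900.0 | 0 | 1 | P | 80 | 36 | 52000 | 80 | 5.125 | R | N | FRM | SC | SF | 29000 | F09Q40451550 | P | 360 | 1 | BRANCH BANKING & TRUST COMPANY | Other servicers | NaN | NaN | 9 | NaN | 9 | N | 9
49992 | 658 | 201002 | Y | 204001 | 17900.0 | 0 | 1 | P | 97 | 42 | 114000 | 97 | 5.250 | R | N | FRM | SC | SF | 29200 | F09Q40451574 | P | 360 | 1 | BRANCH BANKING & TRUST COMPANY | Other servicers | NaN | NaN | 9 | NaN | 9 | N | 9
49993 | 709 | 201001 | Y | 203912 | NaN | 0 | 1 | P | 100 | 36 | 76000 | 100 | 5.375 | R | N | FRM | VA | SF | 22600 | F09Q40451608 | P | 360 | 1 | BRANCH BANKING & TRUST COMPANY | Other servicers | NaN | NaN | 9 | NaN | 9 | N | 9
49994 | 784 | 201002 | Y | 204001 | 40220.0 | 0 | 1 | P | 100 | 25 | 78000 | 100 | 5.250 | R | N | FRM | VA | SF | 24100 | F09Q40451610 | P | 360 | 1 | BRANCH BANKING & TRUST COMPANY | Other servicers | NaN | NaN | 9 | NaN | 9 | N | 9
49995 | 749 | 201002 | Y | 204001 | 16740.0 | 0 | 1 | P | 100 | 38 | 121000 | 100 | 4.875 | R | N | FRM | NC | SF | 28000 | F09Q40451618 | P | 360 | 1 | Other sellers | Other servicers | NaN | NaN | 9 | NaN | 9 | N | 9
49996 | 749 | 201001 | Y | 203912 | 19060.0 | 0 | 1 | P | 100 | 43 | 90000 | 100 | 5.250 | R | N | FRM | WV | SF | 26700 | F09Q40451777 | P | 360 | 1 | Other sellers | Other servicers | NaN | NaN | 9 | NaN | 9 | N | 9
49997 | 775 | 201001 | Y | 203912 | 17900.0 | 0 | 1 | P | 100 | 34 | 126000 | 100 | 5.000 | R | N | FRM | SC | SF | 29200 | F09Q40451784 | P | 360 | 1 | Other sellers | Other servicers | NaN | NaN | 9 | NaN | 9 | N | 9
49998 | 709 | 200912 | Y | 203911 | 16740.0 | 0 | 1 | P | 100 | 40 | 104000 | 100 | 5.250 | R | N | FRM | NC | SF | 28000 | F09Q40451811 | P | 360 | 1 | Other sellers | Other servicers | NaN | NaN | 9 | NaN | 9 | N | 9
49999 | 705 | 201002 | Y | 204001 | NaN | 0 | 1 | P | 93 | 34 | 65000 | 93 | 4.875 | R | N | FRM | PA | SF | 15900 | F09Q40451949 | P | 360 | 2 | Other sellers | Other servicers | NaN | NaN | 9 | NaN | 9 | N | 9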
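
The sample rows also suggest why mat_date is flagged as highly correlated with first_pay and orig_loan_term: reading both date fields as YYYYMM, the loan term in the rows shown equals the inclusive number of months from first payment to maturity. The sketch below checks that relation on a few (first_pay, mat_date, orig_loan_term) triples copied from the sample. The YYYYMM interpretation and the helper name `months_inclusive` are assumptions drawn from these rows, not documented field definitions.

```python
def months_inclusive(first_pay: int, mat_date: int) -> int:
    """Count months from first_pay through mat_date, both assumed to be YYYYMM integers."""
    first_year, first_month = divmod(first_pay, 100)
    mat_year, mat_month = divmod(mat_date, 100)
    return (mat_year - first_year) * 12 + (mat_month - first_month) + 1


# (first_pay, mat_date, orig_loan_term) copied from sample rows 0, 1, 5, 49991 and 49999.
SAMPLE = [
    (200903, 202402, 180),
    (200903, 203902, 360),
    (200904, 203903, 360),
    (200912, 203911, 360),
    (201002, 204001, 360),
]

for first_pay, mat_date, term in SAMPLE:
    matches = months_inclusive(first_pay, mat_date) == term
    print(first_pay, mat_date, term, "consistent" if matches else "differs")
```

If the relation holds across the full dataset, mat_date carries little information beyond first_pay and orig_loan_term; that would need to be verified on all 50000 rows rather than inferred from the twenty shown here.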